MILITARY PSYCHOLOGY, 23:351-364, 2011 
Copyright © Taylor & Francis Group, LLC 
ISSN: 0899-5605 print / 1532-7876 online 

DOI: 10.1080/08995605.2011.589315 


Characteristics of Valid Biographical Items 


Lawrence J. Stricker 
Educational Testing Service, Princeton, New Jersey 


David L. Alderton 


Navy Personnel Research Studies and Technology, Millington, Tennessee 


Donald A. Rock 


Educational Testing Service, Princeton, New Jersey 


This study assessed the relationships between characteristics of biographical items 
from the Armed Services Applicant Profile and the items’ validity in predicting the 
retention of enlisted military personnel. Item characteristics were appraised with 
ratings by expert judges and test takers, word and alternative counts, and response 
latencies. Item content was also appraised with ratings by expert judges. The more 
valid items involved overt behavior or experiences, dealt with discrete behavior or 
experiences, and had heterogeneous content. After controlling for item content, only 
the latter characteristic was related to validity. Item characteristics and item content 
interacted in several instances. 


Despite the long history and wide use of biographical inventories (see recent 
reviews by Breaugh, 2009; Stokes & Cooper, 2003; Stokes, Mumford, & Owens, 
1994), little is known about the characteristics of valid items. Several well-known 
guides for writing biographical items exist (Asher, 1972; Mael, 1991; Mumford 
& Owens, 1987; Mumford & Stokes, 1992; Owens, 1976; Owens, Glennon, 
& Albright, 1962), but the empirical underpinnings for their prescriptions are 
limited. 


Opinions expressed in this article are those of the authors and not necessarily of Educational 
Testing Service or the Department of the Navy. 

Correspondence should be addressed to Lawrence J. Stricker, Educational Testing Service, 
Princeton, NJ 08541. E-mail: lstricker@ets.org 


352 STRICKER, ALDERTON, ROCK 


Several studies appraised the validity of item characteristics against behavioral 
criteria. Three of these investigations used all or most of the 10 item charac- 
teristics in Mael’s (1991) taxonomy of biographical items (Graham, McDaniel, 
Douglas, & Snell, 2002; Lefkowitz, Gebbia, Balsam, & Dunn, 1999; McManus & 
Masztal, 1999). Agreement in the findings was modest: valid items were verifi- 
able (Graham et al., 2002; McManus & Masztal, 1999), not first-hand (they did 
not ask for the respondent’s own evaluation of his or her performance or attitudes; 
Graham et al., 2002; Lefkowitz et al., 1999), and not controllable (they concerned 
the respondent’s physical and social characteristics or actions taken by others; 
Graham et al., 2002; Lefkowitz et al., 1999). 

Another study (Barge, 1987, 1988) assessed three item characteristics, includ- 
ing one in the Mael taxonomy: the valid items were discrete (they involved a single 
unique act or simple count of unique events), in agreement with the McManus and 
Masztal findings. Valid items also concerned samples rather than signs of behavior 
(Wernimont & Campbell, 1968) and had homogeneous content. 

It is noteworthy that only the McManus and Masztal investigation assessed 
validity in a high-stakes situation: the biographical inventory was used in select- 
ing applicants for employment. This distinction may be important because of 
the possibility that test takers distort their responses on self-report measures in 
such a situation, affecting the validity of items and their associations with item 
characteristics. The susceptibility of biographical inventories to distortion is well 
established, though the consequences for validity are uncertain (see the review by 
Lautenschlager, 1994). Faking good in research studies and distortion in high- 
stakes settings are distinguishable, but it is nevertheless instructive that in the 
Graham et al. study all associations of item characteristics with the items’ validity 
observed when participants were instructed to answer honestly disappeared when 
they were asked to fake good. 

One issue not addressed thus far in this itemmetric research is the potential con- 
founding of the items’ characteristics and their content (Barge, 1988; Lefkowitz 
et al., 1999). The pools of items in these studies were heterogeneous in their con- 
tent, raising the possibility that, for example, discrete items are more valid than 
nondiscrete items, because discrete items happen to concern school achievement, 
and items about school achievement are more valid than items with other content. 
A related possibility is that item characteristics and content interact. For example, 
discrete items are more valid than nondiscrete items when the content is school 
achievement but not when the content is something else. 

In view of the limited work on the validity of biographical items, especially in 
high-stakes settings, and the uncertain influence of item content on the previous 
findings, the aim of this study was to assess the link between a comprehensive 
set of potentially important characteristics of biographical items and the items’ 
empirical validity in a selection situation, controlling for the items’ content and 
assessing the interaction between content and item characteristics. 


VALID BIOGRAPHICAL ITEMS 353 


METHOD 


Overview 


The biographical items were 120 items from a larger pool assembled for the 
Armed Services Applicant Profile (ASAP; Trent, 1993). The ASAP, designed to 
predict the adaptability of enlisted personnel to military service, is a traditional 
biographical inventory made up of a heterogeneous collection of items chosen for 
their potential relevance to adjustment and empirically keyed. The item content 
encompasses physical involvement, school achievement, delinquency, work ethic, 
independence, and social adaptation. The items are in a multiple-choice format, 
with three to five alternatives. Fifty-item forms of the ASAP correlated .29 with 
retention at the end of enlistment—usually 48 months. 

The item characteristics were selected on the basis of previous studies and 
commentaries about the characteristics of biographical and personality items 
(Angleitner, John, & Lohr, 1986; Asher, 1972; Barge, 1987, 1988; Blaney, 
1991; Goldberg, 1968; Holden & Fekken, 1990; Holden, Fekken, & Jackson, 
1985; Johnson, 2004; Mael, 1991; Owens et al., 1962; Werner & Pervin, 1986; 
Wiggins & Goldberg, 1965). A large number of characteristics were initially 
identified. They were subsequently winnowed down by eliminating those that 
overlapped with each other or were inapplicable to biographical items, in general, 
or to the ASAP items, in particular. The final set of characteristics was measured 
by experts’ or test takers’ ratings, word and alternative counts, and test takers’ 
response latencies. 

The item content was assessed by experts. The six content areas, noted ear- 
lier, had been identified in a factor analysis of a subset of the ASAP items 
(Trent, 1993). 

The items’ validity was their associations with the retention of military recruits. 


Expert Ratings of Item Characteristics 


Four item characteristics were assessed by raters with graduate training or PhDs 
in psychology: Overt Behavior or Experience (two raters), Discrete Behavior 
or Experience (two raters), Homogeneous vs. Heterogeneous Content (eight 
raters), and Face Validity (eight raters). Each rater assessed a single characteristic. 
Characteristics were rated on a three-point scale: Definitely, Somewhat or 
Uncertain, Not at All (scored 3, 2, 1, respectively). Each item’s mean rating on 
a characteristic was used in the analysis. 


Overt behavior or experience. This variable was suggested by Asher 
(1972). The rating devised for this study was: “Describes overt behavior or 
experience (e.g., took a driver’s education course, had a flat tire) rather than an 
internal state of mind (e.g., likes to drive).” 


Discrete behavior or experience. This variable was used by Barge (1987, 
1988), and the rating was adapted from his own: “Requires test takers to report 
discrete overt behavior or experience (a single instance of behavior or a single 
experience [e.g., date of last auto accident] or a simple count of instances of the 
behavior or experience [e.g., number of auto accidents]) rather than a subjective 
summary of overt behavior or experience (e.g., average miles driven per week) or 
a global evaluation of a trait (e.g., self-rating of driving ability).” 


Homogeneous vs. heterogeneous content. This variable was used by 
Barge, and the rating was adapted from his own: “Describes something (overt 
behavior or experience, or internal state of mind) that reflects a single trait 
(e.g., never being absent from school reflects dependability; worrying a lot reflects 
anxiety) rather than something that reflects several traits (e.g., being on the dean’s 
list in school reflects intelligence, motivation, etc.; being shy reflects lack of 
confidence, inadequate social skills, etc.).” 


Face validity. This variable was used by Holden and Fekken (1990). The 
rating devised for this study was: “Obviously relevant in assessing the adaptability 
of recruits to military service.” 


Other Item Characteristics 


Ambiguity. This variable was used by Gordon (1953). The rating devised 
for this study was: “Considering everything about the question (including its 
answers), how clear was it?” It was rated on a five-point scale: Extremely Clear, 
Very Clear, Somewhat Clear, Slightly Clear, Not Clear at All (scored 1, 2, 3, 4, 5, 
respectively). The raters were 137 Navy recruits in a research study. Each item’s 
mean rating was used in the analysis. 


Number of words. This variable was used by Holden and Fekken (1990). It 
is the total number of words in the item’s stem and alternatives. 


Number of alternatives. This is the number of alternatives for the item. 


Response latency. This variable was used by Holden and Fekken (1990). 
It is the time (in hundredths of a second) between when the item was presented in 
a computer administration and when the response was made. The latencies were 
obtained from 1,090 Navy recruits in a research study (Stricker & Alderton, 1999). 
Each item’s median time was used in the analysis. 




Expert Ratings of Content 


The six content areas were assessed by three PhD-level psychologists with train- 
ing in personality. The factors’ labels and content (Trent, 1993) were used to 
define the content areas, with only minor changes for clarity (e.g., the School 
Achievement factor was relabeled School Involvement). Raters classified the 
items into one of the six content areas (or an “other” category), using a sorting 
procedure (Stricker & Rock, 1998). Each item’s consensus classification (the one 
chosen by at least two raters) was used in the analysis. (Items without a consen- 
sus classification were included in the “other” classification in the analysis.) The 
instructions follow: 


Please read each item (including its alternatives) and decide whether it appears to 
measure one of six factors: Physical Involvement, School Involvement, Delinquency, 
Work Ethic, Independence, and Social Adaptation. The factors were identified in 
factor analyses of some of these items. This is a summary of the items defining the 
factors: 

Physical Involvement. School athletic team membership, extent of athletic activi- 
ties, quality of athletic performance, preference for white-collar or blue-collar work, 
physical demands of military training, and childhood happiness. 

School Involvement. School grades, failing courses, skipping or failing grades, 
school course subjects, school club participation, attitudes toward school and teach- 
ers, and college aspirations; disciplinary actions, suspensions, expulsions, and 
authorized or unauthorized absence. 

Delinquency. Drinking, smoking, running away from home, troublemaking, and 
police/arrest involvement. 

Work Ethic. Employment status, quality and duration of employment, and job 
preference. 

Independence. Social independence, economic self-sufficiency, independent 
friends, motivation level, age, number of full-time jobs, fired from a job, and tattoos. 

Social Adaptation. Social alienation, traditional values, sociability, risk-taking, 
autonomy from parents, dominance, problem solving, flexibility, and sickness. 

If the item appears to be primarily a measure of a particular factor, put it in the 
pile for that factor. If the item does not appear to be primarily a measure of any of 
the factors or appears to measure two or more of the factors more-or-less equally 
well, put it in the “Other” pile. 
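The consensus rule stated before the instructions (the classification chosen by at least two of the three raters, with all remaining items assigned to the “other” category) can be illustrated with a minimal Python sketch. The function name and the example sorts are hypothetical and are not taken from the original procedure.

```python
from collections import Counter

def consensus_classification(rater_labels, minimum_agreement=2, fallback="Other"):
    """Return the label chosen by at least `minimum_agreement` raters, else the fallback."""
    label, count = Counter(rater_labels).most_common(1)[0]
    return label if count >= minimum_agreement else fallback

# Hypothetical sorts of two items by three raters.
print(consensus_classification(["Delinquency", "Delinquency", "Work Ethic"]))
print(consensus_classification(["Delinquency", "Work Ethic", "Independence"]))
```

With three raters and a threshold of two, at most one label can reach consensus, so no tie-breaking rule is needed.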


Validity 


This is the association between the item and the criterion, retention at the end 
of 21 months of service in two samples of military recruits for the four services. 
The recruits took experimental forms of the ASAP, administered with instructions 
that the inventory was being used for selection, when they applied for enlistment 
(Trent, 1993).¹ Sample 1 (N = 13,501, 79% retained) and Sample 2 (N = 13,093, 
also 79% retained) took different 130-item forms, including the 120 items in 
this study (64 of the latter were common to both forms). Cramér’s V (Blalock, 
1979; Hays, 1994) was computed for each item from the contingency table for 
the item’s alternatives (3 to 5) and the dichotomous criterion (retention-attrition). 
(V is a generalization of the phi coefficient for 2 × 2 contingency tables to larger 
tables, 3 × 2 to 5 × 2 in this study; its values range from 0 to 1, and it is 
nondirectional.) V was computed separately for each sample. Each item’s V was 
used in the analysis. (In cases where an item was administered to both samples, 
the mean V was used.) 
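As an illustration of the validity index, the following Python sketch computes Cramér’s V from an item-by-criterion contingency table using the standard formula V = sqrt(chi-square / (N × min(r − 1, c − 1))). With a dichotomous criterion the denominator reduces to N. The function name and the counts are hypothetical, and the sketch assumes that every alternative and both criterion outcomes occur at least once.

```python
from math import sqrt

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    smaller_dim = min(len(table), len(table[0])) - 1
    return sqrt(chi_square / (n * smaller_dim))

# Hypothetical item: four alternatives (rows) by retained/attrited (columns).
item_table = [[300, 50], [250, 70], [150, 80], [90, 60]]
print(round(cramers_v(item_table), 3))
```

Because V is built from chi-square, it carries no sign: it indicates how strongly the alternatives differ in retention, not which alternatives favor it.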


Analysis 


The interrater reliability of the expert and recruit ratings of the item characteristics 
was estimated by the intraclass correlation (Shrout & Fleiss, 1979, Case 1, for 
mean ratings). The reliability of the experts’ content classifications was estimated 
by their mean Kappa (Conger, 1980). The reliability of the item validity index 
was estimated from the product-moment correlation between the indices for the 
64 common items in the two samples. 
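For readers unfamiliar with the Shrout and Fleiss Case 1 coefficient for mean ratings, often written ICC(1,k), a minimal Python sketch follows. It uses the one-way analysis of variance form (BMS − WMS) / BMS, where BMS and WMS are the between-item and within-item mean squares. The function name and the ratings are hypothetical, and the sketch assumes that every item is rated by the same number of raters.

```python
def icc_1k(ratings):
    """ICC(1,k): reliability of the mean of k ratings per item (one-way random effects)."""
    n = len(ratings)       # number of items
    k = len(ratings[0])    # number of ratings per item
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    item_means = [sum(row) / k for row in ratings]
    # Between-item and within-item mean squares from a one-way ANOVA.
    bms = k * sum((m - grand_mean) ** 2 for m in item_means) / (n - 1)
    wms = sum((x - m) ** 2
              for row, m in zip(ratings, item_means)
              for x in row) / (n * (k - 1))
    return (bms - wms) / bms

# Hypothetical three-point ratings of five items by two raters.
example_ratings = [[3, 3], [2, 3], [1, 1], [2, 2], [3, 2]]
print(round(icc_1k(example_ratings), 2))
```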

Product-moment intercorrelations among the item characteristics, item con- 
tent variables, and item validity index were computed. Semipartial correlations 
were computed between each of the item characteristics and the validity index, 
partialing the set of six substantive content variables (dummy coded) out of the 
item characteristic. (The “other” content variable was not used in order to avoid 
collinearity among the content variables.) The interaction between each of the 
item characteristics and the set of six content variables vis-à-vis the item validity 
index was assessed by hierarchical multiple regression analyses. For statistical 
significance, the .05 alpha level was used. For practical significance, a correlation 
of .10 and a semipartial correlation (sr) of .14 were used; these are “small” effect 
sizes, accounting for 1% and 2% of the variance, respectively (Cohen, 1988). 
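The semipartial correlation described above can be sketched in Python as follows. Because the content categories are mutually exclusive, regressing an item characteristic on the dummy-coded content variables (plus an intercept) leaves residuals equal to each item’s deviation from its category mean, so the sketch centers the characteristic within categories and correlates the result with validity. The function names and data are hypothetical, and the sketch assumes every category contains at least one item and that the characteristic varies within categories.

```python
from collections import defaultdict
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def semipartial(characteristic, content, validity):
    """Correlate validity with the part of the characteristic not explained by content."""
    by_area = defaultdict(list)
    for value, area in zip(characteristic, content):
        by_area[area].append(value)
    area_means = {area: sum(vals) / len(vals) for area, vals in by_area.items()}
    # Residual of the characteristic after partialing out content categories.
    residual = [value - area_means[area] for value, area in zip(characteristic, content)]
    return pearson(residual, validity)

# Hypothetical items in which the characteristic is confounded with content.
content = ["School", "School", "Other", "Other"]
characteristic = [5, 6, 1, 2]
validity = [0.3, 0.3, 0.1, 0.1]
print(round(pearson(characteristic, validity), 2))
print(round(semipartial(characteristic, content, validity), 2))
```

In this constructed example the zero-order correlation is high only because the two content areas differ on both variables; once content is partialed out of the characteristic, the association vanishes.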


RESULTS 


Reliability of Variables 


The interrater reliability of the ratings is reported in Table 1. The reliability of the 
expert ratings of item characteristics ranged from .78 for Face Validity to .69 for 
Homogeneous vs. Heterogeneous Content, and was .86 for the recruits’ rating of 
Ambiguity. The mean Kappa was .47 for the experts’ content classifications.² The 
validity indices for the two samples correlated .94. 


TABLE 1 
Interrater Reliability of Item Characteristic Ratings 

Characteristic                            N     ICCᵃ 

Overt behavior or experience              2     .76 
Discrete behavior or experience           2     .77 
Homogeneous vs. heterogeneous content     8     .69 
Face validity                             8     .78 
Ambiguity                               137     .86 

ᵃIntraclass correlation. 


¹The instructions were: “Responses to this questionnaire will be assessed to determine applicants’ 
suitability for military service. Applicants are not required to provide this information; however, failure 
to do so could affect an applicant’s chances of being selected for service. An overall score on the 
questionnaire may become part of your permanent military record.” 


Correlations of Item Characteristics With Item Validity 


The intercorrelations of the item characteristics, item content variables, and item 
validity index, and the semipartial correlations of the item characteristics with the 
item validity index are reported in Table 2. Three characteristics, all expert ratings, 
correlated significantly with item validity: Homogeneous vs. Heterogeneous 
Content (r = −.32), Discrete Behavior or Experience (r = .29), and Overt 
Behavior or Experience (r = .27). That is, the more valid items had heterogeneous 
content, dealt with discrete behavior or experiences, and involved overt behavior 
or experiences. However, when item content was partialed out, only one characteristic 
correlated significantly with item validity, Homogeneous vs. Heterogeneous 
Content (sr = −.24): the more valid items had heterogeneous content. 


Interactions of Item Characteristics and Item Content With Item Validity 


The hierarchical regression analyses for each item characteristic and the set of six 
content variables are reported in Table 3. Interactions with item content were significant 
for two variables: Ambiguity (sr = .37) and Response Latency (sr = .33).³ 

In order to clarify these interactions, additional hierarchical regression analyses 
were carried out for the individual content variables with significant interactions: 


²Eighty-two percent of the items were classified in the six substantive content areas: Physical 
Involvement, N = 9; School Involvement, N = 24; Delinquency, N = 7; Work Ethic, N = 19; 
Independence, N = 14; and Social Adaptation, N = 25. 

³sr is the effect size for the interaction between the item characteristics and the item content variables: 
the increase in the multiple correlation with the criterion when the interaction is added to the 
multiple regression of the item characteristic and the content variables (Cohen, 1988; Cohen, Cohen, 
West, & Aiken, 2002). 


TABLE 2 
Intercorrelations of Item Characteristics, Item Content Variables, and Validity Index 

[Table body not recoverable from the source. It reported the mean, standard deviation, and 
intercorrelations for 15 variables: 1. Physical involvement; 2. School involvement; 
3. Delinquency; 4. Work ethic; 5. Independence; 6. Social adaptation; 7. Overt behavior or 
experience; 8. Discrete behavior or experience; 9. Homogeneous vs. heterogeneous content; 
10. Face validity; 11. Ambiguity; 12. Number of words; 13. Number of alternatives; 
14. Response latency; 15. Validity.] 

Note. N = 120. Variables 1 to 6 are item content, and Variables 7 to 14 are item characteristics. 
Semipartial correlations of item characteristics with the validity index appear in parentheses. 
Zero-order correlations of .18 and .23 are significant at the .05 and .01 levels (two-tailed), respectively. 
The semipartial correlation is significant (p < .01, two-tailed) for one item characteristic, 
Homogeneous vs. Heterogeneous Content. 


TABLE 3 
Hierarchical Regression Analyses of Item Characteristics and Set of Item Content Variables With Item Validity 

                                                                  ΔR² 
Predictor                        df      Overt   Discrete   Homogeneous   Face    Ambiguity   Words         Alternatives   Latency 
Step 1 
  Item content variables (set)   6,113   .26**   .26**      .26**         .26**   .26**       .26**         .26**          .26** 
Step 2 
  Item content variables and     1,112   .01     .02        .06**         .00     .01         .01           .00            .00 
  item characteristic 
Step 3 
  Item content variables,        6,106   .02     .04        .02           .07     .14**       [illegible]   .05            .11** 
  item characteristic, 
  and interaction 

Note. N = 120. Overt = Overt Behavior or Experience, Discrete = Discrete Behavior or Experience, 
Homogeneous = Homogeneous vs. Heterogeneous Content, Face = Face Validity, Words = Number of Words, 
Alternatives = Number of Alternatives, Latency = Response Latency. 
**p < .01. 




TABLE 4 
Hierarchical Regression Analyses of Selected Item Characteristics and Individual Content 
Variables With Item Validity 

                                                        ΔR² 
                                    Ambiguity and   Ambiguity and        Response Latency 
Predictor                   df      Delinquency     School Involvement   and Delinquency 
Step 1 
  Item content variable     1,118   .12**           .07**                .12** 
Step 2 
  Item content variable     1,117   .01             .02                  .00 
  and item characteristic 
Step 3 
  Item content variable,    1,116   .07**           .04*                 .09** 
  item characteristic, 
  and interaction 

Note. N = 120. *p < .05, **p < .01. 


Delinquency in both analyses and School Involvement in the Ambiguity analysis. 
These analyses are reported in Table 4. All interactions were again significant: 
Delinquency with Ambiguity (sr = .26) and Response Latency (sr = .30), 
and School Involvement with Ambiguity (sr = .20). In the interactions with 
Delinquency, Ambiguity and Response Latency were negatively related to validity 
for these items (B = −.00022 and −.00001, respectively) and were unrelated for 
the other items (B = −.00001 and .00000, respectively). In the interaction with 
School Involvement, Ambiguity was negatively related to validity for these items 
(B = −.00015) and was unrelated for the other items (B = −.00001). In sum, 
unlike items with other content, the more valid Delinquency items were unambiguous 
and responded to quickly, and the more valid School Involvement items 
were also unambiguous. 
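The three-step logic of these follow-up analyses (content variable, then item characteristic, then their product) can be sketched in Python. This is an illustrative, dependency-free implementation under stated assumptions, not the authors’ procedure: the function names and data are hypothetical, and ordinary least squares is solved through the normal equations. The reported sr values correspond to the square root of the Step 3 ΔR² (for example, a ΔR² of .09 and an sr of .30).

```python
def r_squared(y, predictors):
    """R^2 from an ordinary least squares fit of y on the predictors plus an intercept."""
    n = len(y)
    rows = [[1.0] + [float(col[i]) for col in predictors] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def hierarchical_steps(validity, content_dummy, characteristic):
    """Return the R^2 at Step 1 and the increments in R^2 at Steps 2 and 3."""
    interaction = [d * x for d, x in zip(content_dummy, characteristic)]
    step1 = r_squared(validity, [content_dummy])
    step2 = r_squared(validity, [content_dummy, characteristic])
    step3 = r_squared(validity, [content_dummy, characteristic, interaction])
    return step1, step2 - step1, step3 - step2

# Hypothetical items: the characteristic lowers validity only when the dummy is 1.
dummy = [1, 1, 1, 1, 0, 0, 0, 0]
ambiguity = [1, 2, 3, 4, 1, 2, 3, 4]
validity = [0.4, 0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1]
print([round(step, 2) for step in hierarchical_steps(validity, dummy, ambiguity)])
```

A nonzero Step 3 increment signals that the slope of validity on the characteristic differs between the items in the content area and the remaining items, which is the pattern described for the Delinquency and School Involvement items.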


DISCUSSION 


The central finding of this study is that the links between biographical item char- 
acteristics and the items’ validity were both limited and complex: just a few 
characteristics played any role, and they were confounded or interacted with item 
content. 

It is remarkable that only three of the eight item characteristics evaluated were 
initially related to the items’ validity, and only one continued to be related when 
item content was controlled. Although just three interactions between the item 
characteristics and item content emerged, they had some consistency, clustering 
around two item characteristics and two item content variables. 

The findings have some similarities to and differences from the few previous 
results concerning the same item characteristics. In common with the two studies 
of discreteness (Barge, 1987, 1988; McManus & Masztal, 1999), the present 
investigation observed (before controlling for content) that discrete items were 
more valid than nondiscrete items. However, this relationship disappeared after 
controlling for content. 

In contrast to the results in the single investigation that assessed homogeneity 
vs. heterogeneity and found that homogeneous items were more valid than 
heterogeneous items (Barge, 1987, 1988), homogeneous items were less valid in 
the present study. One speculation is that this divergent outcome stems from 
differences in the criteria. The retention criterion in this study was multifaceted, 
with many determinants, and may best be tapped by heterogeneous items. The 
various criteria in the other investigation were more discrete and relatively less 
complex, and perhaps better captured by homogeneous items.⁴ The homogeneity-heterogeneity 
of content may be a very relevant feature of biographical items, with its 
operation dependent on the nature of the validity criterion. 

The confounding of item characteristics with content observed in this study, 
confirming previous concerns (Barge, 1987, 1988; Lefkowitz et al., 1999), and the 
interaction of these characteristics with content necessarily raise serious questions 
about the interpretability of findings from other itemmetric studies of the relations 
between characteristics of biographical items and the items’ validity (Barge, 1987, 
1988; Graham et al., 2002; Lefkowitz et al., 1999; McManus & Masztal, 1999). 
None of these investigations controlled for content or assessed interactions with 
it. Further work in this line of research clearly needs to take content into account. 

Experiments that systematically manipulate items’ characteristics and con- 
tent may be the method of choice in such research (Barge, 1988). This 
tactic can ensure that all relevant characteristics and content are ade- 
quately represented, and the effect of each characteristic and each kind 
of content, free of any confounding, can be estimated directly and readily 
interpreted. 

Whatever the direction of research in this area, it is critical to distinguish 
between data that come from high-stakes operational settings and data from low-stakes 
research settings, given the potential for distortion to influence the results (Graham 
et al., 2002). 

Although the content variables were simply employed as controls in this study, 
their substantial effect on the results merits attention. This outcome runs counter 
to the belief that subtlety, a claimed benefit of empirical keying, is a requisite 
for validity in high-stakes settings (e.g., Hough & Paullin, 1994; Vasilopoulos & 
Cucina, 2006). This finding about the importance of a small, defined set of content 
variables also adds weight to the call for a construct-oriented approach to the 
development of biographical measures (e.g., Mumford & Owens, 1987). 


⁴For each job, training performance was assessed by each of several hands-on or job knowledge 
tests, and job performance was measured by ratings on each of five dimensions. The validity was 
estimated for each of the seven or eight components separately for each job. 

The findings have some implications for devising or selecting valid biograph- 
ical items. The interaction with response latency lends support to the suggestion 
that this variable may be useful in selecting valid biographical items (Stricker & 
Alderton, 1999). And the interaction with ambiguity reinforces the standard 
practice of focusing on items that are clear. 

In interpreting the sparse findings, the source of the item pool should be considered. 
The pool was not a random collection of newly minted items. Rather, the items 
came originally from two inventories used operationally or in research, and they 
were extensively screened for pertinence to military adjustment and freedom 
from unfairness, intrusiveness, and the like. Item characteristics may have 
also played some role, at least implicitly, in this process. For these reasons, really 
egregious items are probably absent from the pool (see Trent, 1993). 

An inevitable issue is the generalizability of these results. The item charac- 
teristics represent an array of major variables. The ASAP typifies traditional, 
empirically keyed biographical inventories. The data for the retention criterion 
are close to optimal, given the extremely large samples and the recruits’ extended length 
of service. And the criterion closely matches the ASAP’s purpose. But, of course, 
there are other item characteristics, pools of biographical items, behavioral crite- 
ria for assessing their validity, and populations of test takers. Follow-up research 
is very much in order to confirm or disconfirm the nuanced contribution of 
item characteristics to the validity of biographical items that was observed in 
this study. 


ACKNOWLEDGMENTS 


We acknowledge Armando X. Estrada, Tonia S. Heffner, Deirdre J. Knapp, and 
Leonard A. White for coordinating the expert judgments of item characteristics; 
Michelle Najarian and Alexis Ying for assisting with data analysis; and Brent 
Bridgeman for reviewing a draft of this article. 


REFERENCES 


Angleitner, A., John, O. P., & Lohr, F-J. (1986). It’s what you ask and how you ask it: An item- 
metric analysis of personality questionnaires. In A. Angleitner & J. S. Wiggins (Eds.), Personality 
assessment via questionnaire (pp. 61-107). Berlin, Germany: Springer-Verlag. 

Asher, J. J. (1972). The biographical item: Can it be improved? Personnel Psychology, 25, 251-269. 




Barge, B. N. (1987, August). Characteristics of biodata items and their relationship to validity. In 
M. D. Dunnette (Chair), Biodata in the 80’s and beyond. Symposium conducted at the meeting of 
the American Psychological Association, New York, NY. 

Barge, B. N. (1988). Characteristics of biodata items and their relationship to validity. Dissertation 
Abstracts International, 49 (10), 4599B. (UMI No. 8820469). 

Blalock, H. M. Jr. (1979). Social statistics (rev. 2nd ed.). New York, NY: McGraw-Hill. 

Blaney, P. H. (1991). Not personality scales, personality items. In W. M. Grove & D. Cicchetti 
(Eds.), Thinking clearly about psychology: Vol. 2. Personality and psychopathology (pp. 54-71). 
Minneapolis, MN: University of Minnesota Press. 

Breaugh, J. A. (2009). The use of biodata for employee selection: Past research and future directions. 
Human Resource Management Review, 19, 219-231. 

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: 
Erlbaum. 

Cohen, P., Cohen, J., West, S. G., & Aiken, L. S. (2002). Applied multiple regression/correlation 
analysis for the behavioral sciences (3rd ed.). London, England: Routledge. 

Conger, A. J. (1980). Integration and generalization of Kappas for multiple raters. Psychological 
Bulletin, 88, 322-328. 

Goldberg, L. R. (1968). The interrelationships among item characteristics in an adjective check 
list: The convergence of different indices of item ambiguity. Educational and Psychological 
Measurement, 28, 273-296. 

Gordon, L. V. (1953). Some interrelationships among personality item characteristics. Educational 
and Psychological Measurement, 13, 264-272. 

Graham, K. E., McDaniel, M. A., Douglas, E. F., & Snell, A. F. (2002). Biodata validity decay and 
score inflation with faking: Do item attributes explain variance across items? Journal of Business 
and Psychology, 16, 573-592. 

Hays, W. L. (1994). Statistics (5th ed.). Belmont, CA: Wadsworth. 

Holden, R. R., & Fekken, G. C. (1990). Structured psychopathological test item characteristics and 
validity. Psychological Assessment, 2, 35-40. 

Holden, R. R., Fekken, G. C., & Jackson, D. N. (1985). Structured personality test item characteristics 
and validity. Journal of Research in Personality, 19, 386-394. 

Hough, L., & Paullin, C. (1994). Construct-oriented scale construction—The rational approach. In 
G. S. Stokes, M. D. Mumford, & W. A. Owens (Eds.), Biodata handbook—Theory, research, and 
use of biographical information in selection and performance prediction (pp. 109-145). Palo Alto, 
CA: CPP Books. 

Johnson, J. A. (2004). The impact of item characteristics on item and scale validity. Multivariate 
Behavioral Research, 39, 273-302. 

Lautenschlager, G. J. (1994). Accuracy and faking of background data. In G. S. Stokes, M. D. 
Mumford, & W. A. Owens (Eds.), Biodata handbook—Theory, research, and use of biographical 
information in selection and performance prediction (pp. 391-419). Palo Alto, CA: CPP Books. 

Lefkowitz, J., Gebbia, M. I., Balsam, T., & Dunn, L. (1999). Dimensions of biodata items and 
their relationships to item validity. Journal of Occupational and Organizational Psychology, 72, 
331-350. 

Mael, F. A. (1991). A conceptual rationale for the domain and attributes of biodata items. Personnel 
Psychology, 44, 763-792. 

McManus, M. A., & Masztal, J. J. (1999). The impact of biodata item attributes on validity and socially 
desirable responding. Journal of Business and Psychology, 13, 437-446. 

Mumford, M. D., & Owens, W. A. (1987). Methodology review: Principles, procedures, and findings 
in the application of background data measures. Applied Psychological Measurement, 11, 1-31. 

Mumford, M. D., & Stokes, G. S. (1992). Developmental determinants of individual action: Theory 
and practice in applying background measures. In M. D. Dunnette & L. M. Hough (Eds.), Handbook 
of industrial and organizational psychology (2nd ed., Vol. 3; pp. 61-138). Palo Alto, CA: Consulting 
Psychologists Press. 

Owens, W. A. (1976). Background data. In M. D. Dunnette (Ed.), Handbook of industrial and 
organizational psychology (pp. 609-644). Chicago, IL: Rand McNally. 

Owens, W. A., Glennon, J. P., & Albright, L. E. (1962). Retest consistency and the writing of life 
history items: A first step. Journal of Applied Psychology, 46, 329-331. 

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. 
Psychological Bulletin, 86, 420-428. 

Stokes, G. S., & Cooper, L. A. (2003). Biodata. In M. Hersen (Series Ed.) & J. C. Thomas (Vol. 
Ed.), Comprehensive handbook of psychological assessment: Vol. 4, Industrial and organizational 
assessment (pp. 243-268). New York, NY: Wiley. 

Stokes, G. S., Mumford, M. D., & Owens, W. A. (Eds.) (1994). Biodata handbook—Theory, research, 
and use of biographical information in selection and performance prediction. Palo Alto, CA: CPP 
Books. 

Stricker, L. J., & Alderton, D. L. (1999). Using response latency measures for a biographical inventory. 
Military Psychology, 11, 169-188. 

Stricker, L. J., & Rock, D. A. (1998). Assessing leadership potential with a biographical measure of 
personality traits. International Journal of Selection and Assessment, 6, 164-184. 

Trent, T. (1993). The Armed Services Applicant Profile (ASAP). In T. Trent & J. H. Laurence (Eds.), 
Adaptability screening for the armed forces (pp. 71-99). Washington, DC: Office of the Assistant 
Secretary of Defense (Force Management and Personnel), Department of Defense. 

Vasilopoulos, N. L., & Cucina, J. M. (2006). Faking on noncognitive measures—The interaction 
of cognitive ability and test characteristics. In R. L. Griffith & M. H. Peterson (Eds.), A closer 
examination of applicant faking behavior (pp. 305-331). Greenwich, CT: IAP–Information Age 
Publishing. 

Werner, P. D., & Pervin, L. A. (1986). The content of personality inventory items. Journal of 
Personality and Social Psychology, 51, 622-628. 

Wernimont, P. F., & Campbell, J. P. (1968). Signs, samples, and criteria. Journal of Applied 
Psychology, 52, 372-376. 

Wiggins, J. S., & Goldberg, L. R. (1965). Interrelationships among MMPI item characteristics. 
Educational and Psychological Measurement, 25, 381-397. 




